Enhancing VLAM Workflow Model with MapReduce Operations

Authors

  • Mikolaj Baranowski
  • Adam Belloum
  • Marian Bubak
Abstract

Objective: provide an easy-to-use and efficient domain-specific language (DSL) for defining MapReduce operations in workflows.
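To make the objective concrete, the kind of operation such a DSL would express can be illustrated with a minimal MapReduce-style word count in plain Python. This is a hedged sketch only; the function names and the word-count example are illustrative assumptions, not the actual WS-VLAM DSL.

```python
from itertools import groupby

# Illustrative sketch of the MapReduce model: a map phase that emits
# (key, value) pairs, and a reduce phase that aggregates values per key.
# This is NOT the WS-VLAM DSL; names here are assumptions for illustration.

def map_phase(records):
    # Emit one (word, 1) pair per word in each input line.
    for line in records:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # Group pairs by key (groupby requires sorted input), then sum
    # the values for each key.
    pairs = sorted(pairs, key=lambda kv: kv[0])
    return {key: sum(v for _, v in group)
            for key, group in groupby(pairs, key=lambda kv: kv[0])}

counts = reduce_phase(map_phase(["a b a", "b c"]))
```

A workflow-level DSL would let the user supply only the two phase functions, leaving data partitioning and shuffling to the workflow management system.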

Similar articles

MapReduce Operations with WS-VLAM Workflow Management System

Workflow management systems are widely used to solve scientific problems, as they enable orchestration of remote and local services such as database queries, job submission, and running an application. To extend the role that workflow systems play in data-intensive science, we propose a solution that integrates a WMS with the MapReduce model. In this paper, we discuss a possible solution for combining MapR...


Improving Current Hadoop MapReduce Workflow and Performance

This study proposes an improvement and implementation of an enhanced Hadoop MapReduce workflow that improves the performance of the current Hadoop MapReduce. This architecture speeds up the processing of Big Data by enhancing different parameters in the processing jobs. Big Data needs to be divided into many datasets or blocks and distributed to many nodes within the cluster. Thus, tasks can...


Bind: a Partitioned Global Workflow Parallel Programming Model

High Performance Computing is notorious for its long and expensive software development cycle. To address this challenge, we present Bind: a "partitioned global workflow" parallel programming model for C++ applications that enables quick prototyping and agile development cycles for high performance computing software targeting heterogeneous distributed manycore architectures. We present applica...


Adaptive Information Passing for Early State Pruning in MapReduce Data Processing Workflows

MapReduce data processing workflows often consist of multiple cycles, where each cycle hosts the execution of some data processing operators, e.g., join, defined in a program. A common situation is that many data items that are propagated along in a workflow end up being "fruitless", i.e., they do not contribute to the final output. Given that the dominant costs associated with MapReduce processin...
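The pruning idea described above, dropping "fruitless" items before they propagate further through the workflow, can be sketched as a predicate applied in the map phase so that such items never reach the shuffle. The function names and the predicate below are illustrative assumptions, not the paper's actual mechanism.

```python
# Illustrative sketch of early state pruning in a map phase: records that
# cannot contribute to the final output are filtered out before any
# (key, value) pairs are emitted for the shuffle. The predicate here is
# an assumption for illustration, not the paper's actual technique.

def map_with_pruning(records, is_fruitful):
    # Emit only records whose key passes the fruitfulness predicate,
    # cutting shuffle and downstream processing cost.
    for key, value in records:
        if is_fruitful(key):
            yield (key, value)

records = [("x", 1), ("y", 2), ("x", 3)]
# Suppose a later workflow cycle has revealed that key "y" is fruitless.
surviving = list(map_with_pruning(records, lambda k: k != "y"))
```

The benefit grows with the fraction of fruitless items, since every pruned record is spared the shuffle, sort, and reduce stages of each remaining cycle.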


Constructing gazetteers from volunteered Big Geo-Data based on Hadoop

Traditional gazetteers are built and maintained by authoritative mapping agencies. In the age of Big Data, it is possible to construct gazetteers in a data-driven approach by mining rich volunteered geographic information (VGI) from the Web. In this research, we build a scalable distributed platform and a high-performance geoprocessing workflow based on the Hadoop ecosystem to harvest crowd-sou...



Journal:

Volume   Issue 

Pages  -

Publication date: 2013